INTRODUCTION



This paper reviews the initial impact of the LaValle Test Disclosure
Act on six testing programs administered by Educational Testing Service
for various sponsoring organizations. Table 1 lists the testing programs,
sponsors, and program purposes.

The tests that are discussed include two that are taken by high
school students:

    Preliminary Scholastic Aptitude Test/National Merit Scholarship
    Qualifying Test (PSAT/NMSQT)

    Scholastic Aptitude Test (SAT)

There is one test that may be taken either by high school or college
level students applying for admission to post-secondary education in the
United States:

    Test of English as a Foreign Language (TOEFL)

There are three tests for students in college or college graduates who are
planning to go on for further academic and professional education:

    Graduate Management Admission Test (GMAT)

    Graduate Record Examinations Aptitude Test (GRE)

    Law School Admission Test (LSAT)




Table 1

Testing Programs Administered by Educational Testing Service
Affected by New York State Test Disclosure Law (LaValle Act)

Examinations                      Program Sponsors                Purpose of Examinations

1. Graduate Management            Graduate Management             Serves role in admission to
   Admission Test (GMAT)          Admission Council               graduate study in management

2. Graduate Record Examinations   Graduate Record                 Serves role in admission to
   Aptitude Test (GRE)            Examinations Board              graduate study in arts and
                                                                  sciences

3. Law School Admission Test      Law School Admission            Serves role in admission to
   (LSAT)                         Council                         study of law

4. Preliminary Scholastic         College Board, National         Serves role in guidance and
   Aptitude Test/National         Merit Scholarship               as initial screen for
   Merit Scholarship Qualifying   Corporation                     National Merit Scholarship
   Test (PSAT/NMSQT)                                              Program

5. Scholastic Aptitude Test       College Board                   Serves role in admission to
   (SAT)                                                          undergraduate study

6. Test of English as a           College Board, Graduate         Serves role in both
   Foreign Language (TOEFL)       Record Examinations Board,      undergraduate and graduate
                                  and Educational Testing         admissions
                                  Service

IMMEDIATE OPERATIONAL EFFECTS ON TESTING PROGRAMS

Increase in Number of Test Forms Developed

Prior to the passage of the LaValle Act in New York State it was
possible to offer tests such as the Scholastic Aptitude Test (SAT) and the
Graduate Record Examinations Aptitude Test (GRE) on many different occasions
without building a new test form for each occasion. The individual test
forms remained "secure". No student received copies of the test questions
after the administration. Sample tests were available to students,
however, along with explanations of the test content, test development
process, and purposes for the tests.

Secure testing programs could maintain a small inventory of test
forms, adding a small number of new forms each year, and retiring from use
the oldest versions. The amount of money devoted to operational test
development was a small part of the program budget. Additional funds
were used to support research on the design, development, and validation
of new types of questions.

Once the LaValle Act took effect in New York State on January 1, 1980,
any test covered by the legislation had to become public thirty days after
scores were reported. In order to preserve the small available inventory
of test forms, test program sponsors took such actions as reducing the
number of test administrations in New York State. To maintain the testing
programs over time, however, increases in test development had to be
initiated.

The two programs requiring the smallest increase in development were
the PSAT/NMSQT and TOEFL. Paradoxically, these were the programs with the
smallest and largest volumes of test development prior to disclosure. The
PSAT/NMSQT, which is administered on only two dates in a single week in
October, continued to need only two new test forms annually. TOEFL, which
is administered monthly in states other than New York State, was developing
many test forms per year even before passage of the LaValle Act. This
high rate of test development for TOEFL was needed to maintain an
international testing program under secure conditions. Both the PSAT/NMSQT
and TOEFL, therefore, needed only small increases in test development work.

Most of this additional work was related to the preparation of interpretive
materials to comply with LaValle. Information already contained in
student bulletins and test manuals had to be organized into appropriate
supplementary documents.

For the SAT, GMAT, GRE, and LSAT the increases in workload were quite
significant. There was about a 45% increase for the SAT as 10 new forms a
year were necessary instead of the seven that had been developed previously.
The increase for the GMAT, GRE, and LSAT was 100 to 150%, as more than
twice as many new examinations as were needed in the past had to be
developed each year.
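
The percentages quoted above follow from simple ratios. As a rough illustrative check (the function names are hypothetical and are not drawn from any ETS procedure), the short Python sketch below computes the SAT increase from seven to ten forms, which works out to roughly 43 percent and is consistent with the "about 45%" cited, and shows that an increase of 100 to 150 percent corresponds to 2 to 2.5 times the earlier number of forms.

```python
def percent_increase(old_count, new_count):
    """Percentage increase in going from old_count to new_count."""
    if old_count <= 0:
        raise ValueError("old_count must be positive")
    return (new_count - old_count) / old_count * 100.0


def multiplier(percent):
    """Factor by which a quantity grows for a given percentage increase."""
    return 1.0 + percent / 100.0


# SAT: seven new forms a year before disclosure, ten afterwards.
sat = percent_increase(7, 10)
print(f"SAT increase: {sat:.1f}%")

# GMAT, GRE, LSAT: a 100 to 150% increase, expressed as a multiple
# of the number of forms developed before disclosure.
for pct in (100, 150):
    print(f"{pct}% increase = {multiplier(pct):.1f} times as many forms")
```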

This increase in admissions-related test development, moreover,
occurred at the same time that other changes in the design of some of the
tests were being required because of the impact of disclosure on test
equating. It was necessary to add new test development staff and to
increase arrangements for outside help for test question writing and
review. Major burdens were placed on experienced staff in order to meet
the increase in new forms needed. Inevitably, the programs affected by
disclosure drew large amounts of staff time away from other ongoing
development activities. It would be difficult to overstate the amount of
disruption that resulted.




Acceleration of Development of Equating Methods

The LaValle Act required disclosure for all items contributing to a
student's test score but excluded items included in a test for reasons of
pretesting and equating. The impact of the legislation varied from
program to program depending on the method of equating used previously.
Table 2 summarizes the equating procedures employed by the six programs
prior to disclosure and indicates the effect of the new legislative
requirement.

The SAT used separate anchor test equating. Since anchor tests do
not contribute to the scores of students, these equating subtests
can remain secure and the method can still be used.

TOEFL used Item Response Theory equating. Items can be precalibrated
before they contribute to students' scores. The precalibrated items can
be used only once for equating purposes since they contribute, at that
point, to the scores of students. The method can continue to be used,
however, under test disclosure conditions.



Table 2

Effect of LaValle Act on Equating Method Used
Prior to Disclosure for Six Testing Programs Administered by ETS

Examination            Equating Method Used         Possible to Continue Method?

SAT                    Separate Anchor Test         Yes; no problem

TOEFL                  Item Response Theory         Yes; no problem

PSAT/NMSQT             Embedded Common Items        For a while; as long as
                                                    undisclosed old SAT forms
                                                    are available

GMAT, GRE Aptitude,    Spiralling: new form         No; old and new forms
and LSAT               given with one or            will be disclosed
                       more old forms

The PSAT/NMSQT was equated through embedded common items from retired
forms of the SAT. Since these items do contribute to the scores of
students, the method can be used only as long as undisclosed forms of the
SAT are available.

For the GRE Aptitude Test, GMAT, and LSAT the method of equating that
had been employed is called "spiralling". Spiralling requires the
administration, along with the new test, of one or more old test forms that
are already on the test score scale. At a GRE Aptitude administration where
a new test form was being introduced, one-half the students might take this
new test and one-half an old test. Since tests were assigned randomly to
students, differences between the raw scores on the two tests could be
attributed to the differences in the relative difficulty of the tests and
appropriate equating adjustments made. Under this model, though, the
entire old test must remain secure. With disclosure one can use a test
only once. Then it must be disclosed and thus rendered unusable for
future equating. For the GRE, GMAT, and LSAT, therefore, the LaValle Act
struck down the existing method of equating.
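
The logic of spiralling can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper does not say which statistical adjustment was applied, so linear (mean and standard deviation) equating is used here as one common choice for randomly equivalent groups, and the function names and raw scores are hypothetical.

```python
import random
import statistics


def spiral(examinee_ids, forms=("new", "old"), seed=0):
    """Randomly assign examinees to forms in roughly equal numbers."""
    rng = random.Random(seed)
    shuffled = list(examinee_ids)
    rng.shuffle(shuffled)
    # Deal the shuffled examinees out to the forms in rotation.
    return {eid: forms[i % len(forms)] for i, eid in enumerate(shuffled)}


def linear_equating(new_form_scores, old_form_scores):
    """Build a function mapping new-form raw scores onto the old form's scale.

    Assumes the two score lists come from randomly equivalent groups, so that
    differences in mean and spread reflect the forms, not the examinees.
    """
    mean_new = statistics.mean(new_form_scores)
    mean_old = statistics.mean(old_form_scores)
    sd_new = statistics.pstdev(new_form_scores)
    sd_old = statistics.pstdev(old_form_scores)
    if sd_new == 0:
        raise ValueError("new-form scores have no spread; cannot equate")
    slope = sd_old / sd_new

    def to_old_scale(raw_new):
        return mean_old + slope * (raw_new - mean_new)

    return to_old_scale


# Hypothetical administration: six examinees split between the two forms.
assignment = spiral(range(6))

# Hypothetical raw scores; the new form is assumed to be harder, so the group
# taking it scores lower on average than the group taking the old form.
new_scores = [40, 50, 60]
old_scores = [44, 54, 64]

convert = linear_equating(new_scores, old_scores)
print(assignment)
print(convert(50))  # raw score of 50 on the new form, expressed on the old form
```

Because the adjustment depends on scores from an old form whose scale is already established, the sketch also makes plain why the old form must stay secure: once it is disclosed, it cannot be readministered as the reference form.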

Clearly alternative equating models had to be employed. In the case
of the GRE the program continued to employ the spiralling method at
nondisclosed administrations in states other than New York, but also moved
to develop sets of separate anchor tests for each of its three scores:
analytical, mathematical, and verbal. (These tests represented additional
test development work and expense in a program where the staff involved
were also faced with the additional work required to increase total test
form production.) The format of the test was changed to equalize section
timing and the total test time was increased so that equating sections
could be administered routinely as part of the program.

Reducing the number of administration dates in New York State and
creating the potential for anchor test equating provided the GRE program
with the time needed to plan and carry out additional experimentation.
Nondisclosed administrations can be used to compare the results obtained
by spiralling, anchor test equating, and the use of item response theory
equating.

The choice of an alternative equating method for the GMAT and LSAT is
a more difficult problem. The heterogeneous nature of these examinations
argues against the use of either anchor test equating or item response
theory equating until additional experimental work is carried out.
Research is needed extending across several administrations and candidate
samples. Although research on IRT equating for these tests has been
proposed, it could not be carried out within the developmental schedule
needed to ensure compliance with the LaValle legislation.

For the GMAT and LSAT, as well as other testing programs, the program
sponsors and ETS have the responsibility either to use psychometric
methods that assure that scores from different test forms have the same
meaning or, if there must be a break in the continuity of the score scale,
to alert score users about the change in the meaning of scores. In either
case, steps must be taken to avoid misinterpretation of test scores.
Breaking the score scale for a program may be necessary to permit the
introduction of a new equating method, but that method or a comparable one
must then be used to place future test forms on the new score scale.



For both the GMAT and LSAT we are currently experimenting with a
newly developed procedure known as section pre-equating. This method
requires the administration of a final test experimentally in a series of
pairs of sections prior to the operational use of the test. Since section
pre-equating experimentation has had to take place at the same time as a
major pretesting effort for the large amount of newly written test
questions, as many as 40 different versions of test sections that do
not count toward the reported score have had to be included with the
operational test used at a particular administration. The logistics
of developing such a complex system and seeing it through production,
administration, and analysis have been formidable.

As part of the adaptation of the GMAT to the section pre-equating
method, the test has been reformatted so that each section has the same
timing. The total time for the GMAT has also been increased, initially by
one-half hour, but with a future time increase also to be required. The
total increase in time will be 55 minutes.

For the LSAT more significant changes in test content are being
undertaken and the program sponsor, the Law School Admission Council, has
decided to introduce a new score scale. The use of a new score scale will
make it clear to score users that the nature of the test has undergone a
major change. Scores on the new score scale will not be linked to scores
on the old scale.

Although the development of a new equating method ought to have years
of research to confirm its effectiveness and explore its implications, the
LaValle Act has greatly reduced the lead time available for such work.
While the experimentation underway will almost certainly hasten the use of
new equating tools, it is hard to ignore the potential dangers of having
to follow a developmental timetable that is based on a legislative mandate,
a mandate, moreover, that makes little provision for the research and
analysis that is consistent with high measurement standards.



Filing Requirements and Interpretive Materials

The LaValle legislation required the filing of background reports
and statistical data regarding the affected tests with the Commissioner of
the State of New York. In the debate over the need for the LaValle
legislation, ETS and other test publishers argued that high quality
technical and interpretive material was readily available, while critics
of testing contested this claim. (See Strenio, 1979, ACT, 1980, and
Brown, 1980.)

The 1979 LaValle Act was amended in 1980 in a manner that protected
institutional privacy and which provided some flexibility as to the point
in the testing process at which information could be provided to candidates.
It proved possible for ETS to meet the filing requirements of LaValle
through the submission of materials already available to interested
parties prior to the legislation. Reports from New York State indicate
that the large amount of materials that is now on file with the Department
of Education has received little use. For example, there appear to have
been no inquiries to Educational Testing Service during the first 18
months of the LaValle Act (January 1, 1980 to June 30, 1981) that are
related to the technical materials filed with the Commissioner.

In addition to filing materials with the Commissioner of Education in
Albany, New York, the test program sponsors provided additional material
to test takers who requested copies of their test booklets. In most
instances the material accompanying the booklet was relatively brief,
providing, for example, an explanation of the procedure used to derive a
scaled score.

In the TOEFL program an additional package of materials, "Understanding
TOEFL: Test Kit 1," going beyond the requirements of the legislation, was
also prepared. The Test Kit contains a complete TOEFL test along with an
explanation of each question in the test and of the four answer choices
for each question. The very large response to the Test Kit makes clear
that this publication is filling an important information need.

GROUNDWORK FOR FUTURE CHANGES

Altering One of the Basic Conditions of Testing Programs

The greatest impact of test disclosure at ETS may stem from the
change in attitude it is bringing about in the contributors to test
development for national admissions testing programs. The concept of test
security has been central to many aspects of the development of major
testing programs that are providing information for individuals and
institutions. Now security of test questions can be preserved only up to
the time of the first operational use of a test.

The initial accommodations to disclosure that are described earlier
in this paper permit testing to continue while more attention is given to
additional alternatives. Once test development and equating reach a
stable state, it is likely that the entire test development and analysis
process at ETS will be analyzed and redesigned. Major change is predictable
because test disclosure, although a major force, is not the only pressure
for innovation on the test development process. Advances in technology
and in measurement theory also push on our current methods. As a result,
at Educational Testing Service, groups of staff representing different
areas of specialization and interests are collaborating on reviews of our
most tradition-bound assumptions. Fundamental changes in measurement
practice appear to be both possible and desirable.

Broadening Input to the Test Development Process

A theme that cuts across many of the emerging trends in test
development at ETS is that of additional external involvement. The
participation of educators from schools and colleges in the development of
examinations has long been a standard feature of our work. There is
increasing evidence, though, of ETS and program sponsor interest in
broadening still further the role played by such external contributors.
Such actions are viewed as ways of demonstrating our commitment to public
accountability and our openness to the ideas of those impacted by testing.

Among the kinds of activities that support this generalization are
the following:

• An increase in the number of opportunities for students to sit
  with subject-matter committees to discuss the experience of taking
  a test that had been developed with the help of the committee.

• The establishment by test program sponsors of Test Question Review
  Committees with responsibility for reviewing each question before
  it appears on a final test. (A practice already in place for many
  major tests prior to test disclosure legislation.)

• The development of a specifications survey model to provide additional
  input into the determination of test content specifications, and
  to evaluate the match between test questions and the associated
  specifications.

• The formation, by the ETS Senior Vice President for Testing Programs,
  of an external advisory group. This group includes among its members
  representatives of student associations, representatives of other
  educational groups, and individuals prominent in educational research
  and measurement.

External input to ETS test development includes letters from test
takers and the media coverage given to our work. In some instances our
activities have even moved from no news coverage at all to the education
pages of newspapers to page one. Such attention does provide an opportunity
to explain what we believe to be the strengths and weaknesses of our tests
to quite a wide audience. If the result is that readers and listeners end
up with a balanced picture, it may be all to the good. Former ETS President
William Turnbull noted that educational testing is in greater danger from
the zealots who perpetuate fallacies, such as the notion that tests
measure with infallible precision, than from our critics (Turnbull, 1978).



Anyone who is unable to take strong criticism has no business working
in a field where improvement through criticism is so fundamental.

Possible Changes in the Nature of Tests

It is my hope that the thorough shaking which test disclosure has
given to testing and the continuation of the trend toward greater external
involvement will result in a number of desirable modifications to testing.
My reading of the kinds of input we are getting suggests that we will be
facing increasingly greater pressure not only to provide evidence of
validity for our tests for their intended purposes but also to justify
each question that we include on a test. It will not be enough to explain
the steps leading to specifications calling for a question of a particular
type. We will be called upon to provide evidence that the test questions
measure the construct that the test is designed to measure.

Not only do I expect attention to extend to individual test questions
but also to the answer choices for these questions. What is the basis
for giving credit to a particular answer in scoring? How do the test
directions, the descriptive material, and the question itself establish
the context for identifying that response as the key? In the past, in
highly structured areas such as mathematics, we have argued the "self-
evident correctness" of the keyed answer to those with the appropriate
mathematical skill. Experiences with question challenges, though, make it
clear that variations in the interpretation of language can exist in any
discipline. It is the test development process itself that must serve as
the basis for accepting as appropriate a test with its associated key. To
reach that end, the process must permit the test taker and other interested
people a chance to "challenge the experts" and be heard.

Despite my interest, and that of my ETS colleagues, in attempting to
extract some benefit in the future for education and testing from disclosure,
I see some further negative outcomes looming ahead. I believe, for example,
that the 1980 extension of the LaValle Act to cover achievement examinations
is likely to result in the complete elimination, in all states, of certain
of these examinations. As of April 1981, only six of the twenty GRE
Advanced tests are offered in New York State. The loss of New York
testing volume for some of the remaining tests that already have small
volumes further reduces their economic viability. Since small volume
tests also pose problems in obtaining equating samples, at a minimum I
expect to see a reduced number of offerings nationally.

Another possible negative consequence of disclosure is a shift in
test question content away from more imaginative and insightful types
towards conventional forms. This type of shift could occur in an effort
to avoid questions that are susceptible to criticism. At worst we could
end up with tests of mathematical skills that emphasized what one of my
ETS colleagues calls "rat-tat-tat mathematics." The test development staff
at ETS are working with external contributors in an attempt to develop
procedures that will permit the test development enterprise to continue to
evolve in positive directions without being trapped into the kind of
content limitations that are a potential danger.

Closing Comment

This paper calls attention to the major impacts on test development
volumes and equating procedures at ETS as a result of test disclosure. It
is possible that the public attention that testing is receiving and the
increase in external assistance in test planning and development will lead
to more positive future effects. As an ETS test developer I believe that
I need to listen carefully to the many kinds of suggestions that are being
made and to be willing to challenge past assumptions. As I see it neither
the critics nor the proponents of testing are urging abandonment of
high measurement standards, or of commitment to fairness in testing and
test use; what is being asked for, though, is a willingness to explore
alternative routes to these same ends.